{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "p3A9q4LNh1bL"
      },
      "source": [
        "##### Copyright 2025 Google LLC."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 1,
      "metadata": {
        "cellView": "form",
        "id": "KGxPOhGBh2Xy"
      },
      "outputs": [],
      "source": [
        "# @title Licensed under the Apache License, Version 2.0 (the \"License\");\n",
        "# you may not use this file except in compliance with the License.\n",
        "# You may obtain a copy of the License at\n",
        "#\n",
        "# https://www.apache.org/licenses/LICENSE-2.0\n",
        "#\n",
        "# Unless required by applicable law or agreed to in writing, software\n",
        "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
        "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
        "# See the License for the specific language governing permissions and\n",
        "# limitations under the License."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "sP8PQnz1QrcF"
      },
      "source": [
        "# Gemini API: Basic information extraction"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "bxGr_x3MRA0z"
      },
      "source": [
        "<a target=\"_blank\" href=\"https://colab.research.google.com/github/google-gemini/cookbook/blob/main/examples/prompting/Basic_Information_Extraction.ipynb\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" height=30/></a>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ysy--KfNRrCq"
      },
      "source": [
        "This example notebook shows how to use the Gemini API's Python SDK to extract information from a block of text and return it in a defined structure.\n",
        "\n",
        "In this notebook, the model is given a recipe and asked to extract all the ingredients to create a shopping list. Complex tasks tend to produce better results when divided into separate steps, so the task is split in two:\n",
        "\n",
        "1. First, the model will extract all the groceries into a list.\n",
        "\n",
        "2. Then, you will prompt it to convert this list into a shopping list.\n",
        "\n",
        "You can find more tips for writing prompts [here](https://ai.google.dev/gemini-api/docs/prompting-intro).\n"
      ]
    },
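    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The two steps above form a small prompt chain: the text returned by the first call becomes the input of the second. The following is a minimal sketch of that data flow, assuming a hypothetical `generate(system_instruction, contents)` callable that wraps the model call and returns text. Neither `chain_two_steps` nor `fake_generate` is part of the SDK; the stand-in only lets the sketch run without an API key.\n",
        "\n",
        "```python\n",
        "from typing import Callable\n",
        "\n",
        "# Hypothetical signature: (system_instruction, contents) -> generated text.\n",
        "GenerateFn = Callable[[str, str], str]\n",
        "\n",
        "\n",
        "def chain_two_steps(recipe: str, generate: GenerateFn) -> str:\n",
        "    # Step 1: extract the groceries from the recipe.\n",
        "    groceries = generate('Extract all groceries with their quantities.', recipe)\n",
        "    # Step 2: feed the extracted text into the formatting step.\n",
        "    return generate('Organize the groceries into a shopping list.', groceries)\n",
        "\n",
        "\n",
        "def fake_generate(system_instruction: str, contents: str) -> str:\n",
        "    # Stand-in for a model call: tags the input so the data flow is visible.\n",
        "    return f'[{system_instruction.split()[0]}] {contents}'\n",
        "\n",
        "\n",
        "result = chain_two_steps('2 eggs, 1 onion', fake_generate)\n",
        "print(result)\n",
        "```\n",
        "\n",
        "With a real model, `generate` would wrap `client.models.generate_content` and pass the system instruction through `types.GenerateContentConfig`."
      ]
    },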
    {
      "cell_type": "code",
      "execution_count": 2,
      "metadata": {
        "id": "Ne-3gnXqR0hI"
      },
      "outputs": [],
      "source": [
        "%pip install -U -q \"google-genai>=1.0.0\""
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "eomJzCa6lb90"
      },
      "source": [
        "## Configure your API key\n",
        "\n",
        "To run the following cell, your API key must be stored in a Colab Secret named `GOOGLE_API_KEY`. If you don't already have an API key, or you're not sure how to create a Colab Secret, see [Authentication](https://github.com/google-gemini/cookbook/blob/main/quickstarts/Authentication.ipynb) for an example."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 3,
      "metadata": {
        "id": "v-JZzORUpVR2"
      },
      "outputs": [],
      "source": [
        "from google import genai\n",
        "from google.colab import userdata\n",
        "\n",
        "GOOGLE_API_KEY=userdata.get('GOOGLE_API_KEY')\n",
        "client = genai.Client(api_key=GOOGLE_API_KEY)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "qgZhPe08uHkG"
      },
      "source": [
        "Additionally, select the model you want to use from the available options below:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 4,
      "metadata": {
        "id": "jm5jSLXguIMN"
      },
      "outputs": [],
      "source": [
        "MODEL_ID = \"gemini-2.5-flash\"  # @param [\"gemini-2.5-flash-lite\", \"gemini-2.5-flash\",\"gemini-2.5-pro\",\"gemini-3-pro-preview\"] {\"allow-input\": true, \"isTemplate\": true}"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "L-Wt23A_uzFZ"
      },
      "source": [
        "## Example\n",
        "\n",
        "First, start by extracting all the groceries. To do this, set the system instruction in the request configuration (`types.GenerateContentConfig`)."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 5,
      "metadata": {
        "id": "x-Mf5-Vsw2Ft"
      },
      "outputs": [],
      "source": [
        "from google.genai import types\n",
        "\n",
        "groceries_system_prompt = \"\"\"\n",
        "  Your task is to extract all the groceries, with their quantities, into a list based on the provided recipe.\n",
        "  Make sure that groceries are in the order of appearance.\n",
        "\"\"\"\n",
        "\n",
        "grocery_extraction_config = types.GenerateContentConfig(\n",
        "    system_instruction=groceries_system_prompt\n",
        ")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "4YJRWBbviSeC"
      },
      "source": [
        "Next, define the recipe and pass it to `generate_content`. The output shows the list of groceries extracted from the input."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 6,
      "metadata": {
        "id": "yebFPUvcxDdZ"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "- 3 garlic cloves\n",
            "- knob of fresh ginger\n",
            "- 3 spring onions\n",
            "- 2 tbsp of clear honey\n",
            "- 1 orange\n",
            "- 1 tbsp of light soy sauce\n",
            "- 2 tbsp of vegetable oil\n",
            "- 4 small chicken breast fillets\n",
            "- 20 button mushrooms\n",
            "- 20 cherry tomatoes\n",
            "- 2 large red peppers\n",
            "\n"
          ]
        }
      ],
      "source": [
        "recipe = \"\"\"\n",
        "  Step 1:\n",
        "  Grind 3 garlic cloves, knob of fresh ginger, roughly chopped, 3 spring onions to a paste in a food processor.\n",
        "  Add 2 tbsp of clear honey, juice from one orange, 1 tbsp of light soy sauce and 2 tbsp of vegetable oil, then blend again.\n",
        "  Pour the mixture over the cubed chicken from 4 small breast fillets and leave to marinate for at least 1hr.\n",
        "  Toss in the 20 button mushrooms for the last half an hour so they take on some of the flavour, too.\n",
        "\n",
        "  Step 2:\n",
        "  Thread the chicken, 20 cherry tomatoes, mushrooms and 2 large red peppers onto 20 wooden skewers,\n",
        "  then cook on a griddle pan for 7-8 mins each side or until the chicken is thoroughly cooked and golden brown.\n",
        "  Turn the kebabs frequently and baste with the marinade from time to time until evenly cooked.\n",
        "  Arrange on a platter, and eat with your fingers.\n",
        "\"\"\"\n",
        "\n",
        "grocery_list = client.models.generate_content(\n",
        "    model=MODEL_ID,\n",
        "    contents=recipe,\n",
        "    config=grocery_extraction_config\n",
        ")\n",
        "print(grocery_list.text)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "w0IH1dd3jSes"
      },
      "source": [
        "The next step is to turn the extracted ingredients into a formatted shopping list."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 7,
      "metadata": {
        "id": "sU0pld4QQqOe"
      },
      "outputs": [],
      "source": [
        "shopping_list_system_prompt = \"\"\"\n",
        "  You are given a list of groceries. Complete the following:\n",
        "  - Organize groceries into categories for easier shopping.\n",
        "  - List each item one under another with a checkbox [ ].\n",
        "\"\"\"\n",
        "\n",
        "shopping_list_config = types.GenerateContentConfig(\n",
        "    system_instruction=shopping_list_system_prompt\n",
        ")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Ea84Nf2rkWX9"
      },
      "source": [
        "Now that you have defined the instructions, you can also decide how the shopping list should be formatted. Include a worked example or two in the prompt (few-shot prompting) so the model understands the expected format."
      ]
    },
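    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Few-shot prompting amounts to placing worked input/output pairs ahead of the real input and leaving the final output open for the model to complete. The following is a minimal sketch of assembling such a prompt. `build_few_shot_prompt` is a hypothetical helper, and the `LIST:` / `OUTPUT:` labels are a convention of this notebook, not an SDK requirement.\n",
        "\n",
        "```python\n",
        "def build_few_shot_prompt(examples: list[tuple[str, str]], query: str) -> str:\n",
        "    # Each example contributes its input, its expected output, and a blank line.\n",
        "    lines = []\n",
        "    for source, target in examples:\n",
        "        lines += [f'LIST: {source}', 'OUTPUT:', target, '']\n",
        "    # The real input goes last, with OUTPUT left open for the model.\n",
        "    lines += [f'LIST: {query}', 'OUTPUT:']\n",
        "    return '\\n'.join(lines)\n",
        "\n",
        "\n",
        "EXAMPLE_OUTPUT = '\\n'.join([\n",
        "    '## VEGETABLES',\n",
        "    '- [ ] 7 tomatoes',\n",
        "    '## MEAT',\n",
        "    '- [ ] 1 turkey',\n",
        "])\n",
        "\n",
        "prompt = build_few_shot_prompt(\n",
        "    [('3 tomatoes, 1 turkey, 4 tomatoes', EXAMPLE_OUTPUT)],\n",
        "    '2 eggs',\n",
        ")\n",
        "print(prompt)\n",
        "```\n",
        "\n",
        "Keeping the examples in a list makes it easy to add a second or third pair if the model drifts from the expected format."
      ]
    },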
    {
      "cell_type": "code",
      "execution_count": 8,
      "metadata": {
        "id": "3QSf7m5QxmC-"
      },
      "outputs": [
        {
          "data": {
            "text/markdown": "## VEGETABLES\n- [ ] 3 garlic cloves\n- [ ] knob of fresh ginger\n- [ ] 3 spring onions\n- [ ] 20 button mushrooms\n- [ ] 20 cherry tomatoes\n- [ ] 2 large red peppers\n\n## FRUITS\n- [ ] 1 orange\n\n## MEAT\n- [ ] 4 small chicken breast fillets\n\n## SAUCES & OILS\n- [ ] 2 tbsp of clear honey\n- [ ] 1 tbsp of light soy sauce\n- [ ] 2 tbsp of vegetable oil\n",
            "text/plain": [
              "<IPython.core.display.Markdown object>"
            ]
          },
          "execution_count": 8,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "from IPython.display import Markdown\n",
        "\n",
        "shopping_list_prompt = f\"\"\"\n",
        "  LIST: 3 tomatoes, 1 turkey, 4 tomatoes\n",
        "  OUTPUT:\n",
        "  ## VEGETABLES\n",
        "  - [ ] 7 tomatoes\n",
        "  ## MEAT\n",
        "  - [ ] 1 turkey\n",
        "\n",
        "  LIST: {grocery_list.text}\n",
        "  OUTPUT:\n",
        "\"\"\"\n",
        "\n",
        "shopping_list = client.models.generate_content(\n",
        "    model=MODEL_ID,\n",
        "    contents=shopping_list_prompt,\n",
        "    config=shopping_list_config\n",
        ")\n",
        "\n",
        "Markdown(shopping_list.text)"
      ]
    },
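    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The shopping list above is Markdown text. If other code needs to consume it, the text can be parsed into a dictionary. The following is a minimal sketch, assuming the response follows the `## CATEGORY` and `- [ ] item` layout from the few-shot example, which model output is not guaranteed to do. `parse_shopping_list` is a hypothetical helper, and lines that do not match the layout are skipped.\n",
        "\n",
        "```python\n",
        "def parse_shopping_list(markdown_text: str) -> dict[str, list[str]]:\n",
        "    # Group '- [ ] item' lines under the most recent '## CATEGORY' heading.\n",
        "    categories: dict[str, list[str]] = {}\n",
        "    current = None\n",
        "    for raw_line in markdown_text.splitlines():\n",
        "        line = raw_line.strip()\n",
        "        if line.startswith('## '):\n",
        "            current = line[3:].strip()\n",
        "            categories.setdefault(current, [])\n",
        "        elif line.startswith('- [ ] ') and current is not None:\n",
        "            categories[current].append(line[6:].strip())\n",
        "    return categories\n",
        "\n",
        "\n",
        "sample = '''\n",
        "## FRUITS\n",
        "- [ ] 1 orange\n",
        "## MEAT\n",
        "- [ ] 4 small chicken breast fillets\n",
        "'''\n",
        "parsed = parse_shopping_list(sample)\n",
        "print(parsed)\n",
        "```\n",
        "\n",
        "In the notebook, the same function could be called on `shopping_list.text`."
      ]
    },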
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "PhttRO0TD9mN"
      },
      "source": [
        "## Next steps\n",
        "\n",
        "Be sure to explore other examples of prompting in the repository. Try creating your own prompts for information extraction or adapt the ones provided in the notebook."
      ]
    }
  ],
  "metadata": {
    "colab": {
      "name": "Basic_Information_Extraction.ipynb",
      "toc_visible": true
    },
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}
